AITopics | gated recurrent unit


gated recurrent unit


Preventing Gradient Explosions in Gated Recurrent Units

Neural Information Processing Systems, Nov-21-2025, 16:16:23 GMT

A gated recurrent unit (GRU) is a successful recurrent neural network architecture for time-series data. The GRU is typically trained using a gradient-based method, which is subject to the exploding gradient problem, in which the gradient norm grows very large. This problem is caused by an abrupt change in the dynamics of the GRU due to a small variation in the parameters. In this paper, we find a condition under which the dynamics of the GRU change drastically and propose a learning method to address the exploding gradient problem. Our method constrains the dynamics of the GRU so that they do not change drastically. We evaluated our method in experiments on language modeling and polyphonic music modeling. Our experiments showed that our method can prevent the exploding gradient problem and improve modeling accuracy.
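To make the "abrupt change in the dynamics" concrete, the following is a minimal sketch of a GRU update in plain Python, followed by a one-dimensional toy experiment. It is not the paper's method. The helper names (`gru_step`, `run`), the omission of biases, the zero input, and the choice to set the gate recurrent weights to zero are all assumptions made for illustration. The update uses one common convention, h_new = (1 - z) * h + z * candidate.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gru_step(x, h, p):
    """One GRU update without biases (a simplification for this sketch).

    p maps names to matrices: W_* act on the input x, U_* on the state h.
    """
    z = [sigmoid(a + b) for a, b in zip(matvec(p["W_z"], x), matvec(p["U_z"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["W_r"], x), matvec(p["U_r"], h))]
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [
        math.tanh(a + b)
        for a, b in zip(matvec(p["W_h"], x), matvec(p["U_h"], reset_h))
    ]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run(u, steps=200, h0=0.01):
    """Iterate a 1-D GRU with zero input and recurrent weight u (hypothetical setup)."""
    zero = [[0.0]]
    p = {"W_z": zero, "U_z": zero, "W_r": zero, "U_r": zero,
         "W_h": zero, "U_h": [[u]]}
    h = [h0]
    for _ in range(steps):
        h = gru_step([0.0], h, p)
    return h[0]


if __name__ == "__main__":
    for u in (1.0, 1.9, 2.1, 4.0):
        print(u, run(u))
```

In this toy setup both gates sit at 0.5, so the update reduces to h_new = 0.5 * h + 0.5 * tanh(0.5 * u * h), whose slope at h = 0 is 0.5 + 0.25 * u. The state therefore decays toward 0 when u is below 2 and settles at a nonzero value when u is above 2, which illustrates how a small parameter change can alter the long-run behaviour qualitatively.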

gated recurrent unit, name change, preventing gradient explosion, (4 more...)


Industry:

Media > Music (0.62)
Leisure & Entertainment (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


Convolutional Spiking-based GRU Cell for Spatio-temporal Data

Abdennadher, Yesmine, Cicciarella, Eleonora, Rossi, Michele

arXiv.org Artificial Intelligence, Oct-30-2025

Spike-based temporal messaging enables SNNs to efficiently process both purely temporal and spatio-temporal time-series or event-driven data. Combining SNNs with Gated Recurrent Units (GRUs), a variant of recurrent neural networks, gives rise to a robust framework for sequential data processing; however, traditional RNNs often lose local details when handling long sequences. Previous approaches, such as SpikGRU, fail to capture fine-grained local dependencies in event-based spatio-temporal data. In this paper, we introduce the Convolutional Spiking GRU (CS-GRU) cell, which leverages convolutional operations to preserve local structure and dependencies while integrating the temporal precision of spiking neurons with the efficient gating mechanisms of GRUs. This versatile architecture excels on both temporal datasets (NTIDIGITS, SHD) and spatio-temporal benchmarks (MNIST, DVSGesture, CIFAR10DVS). Our experiments show that CS-GRU outperforms state-of-the-art GRU variants by an average of 4.35%, achieving over 90% accuracy on sequential tasks and up to 99.31% on MNIST. It is worth noting that our solution achieves 69% higher efficiency compared to SpikGRU. The code is available at: https://github.com/YesmineAbdennadher/CS-GRU.

artificial intelligence, machine learning, neural network, (17 more...)


doi: 10.1109/MLSP62443.2025.11204318

2510.25696

Genre: Research Report (0.82)

Industry: Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


GRAD: Real-Time Gated Recurrent Anomaly Detection in Autonomous Vehicle Sensors Using Reinforced EMA and Multi-Stage Sliding Window Techniques

Naeimi, Mohammad Hossein Jafari, Norouzi, Ali, Abdi, Athena

arXiv.org Artificial Intelligence, Oct-28-2025

This paper introduces GRAD, a real-time anomaly detection method for autonomous vehicle sensors that integrates statistical analysis and deep learning to ensure the reliability of sensor data. The proposed approach combines the Reinforced Exponential Moving Average (REMA), which adapts smoothing factors and thresholding for outlier detection, with the Multi-Stage Sliding Window (MS-SW) technique for capturing both short- and long-term patterns. These features are processed using a lightweight Gated Recurrent Unit (GRU) model, which detects and classifies anomalies based on bias types, while a recovery module restores damaged sensor data to ensure continuous system operation. GRAD has a lightweight architecture consisting of two layers of GRU with a limited number of neurons that make it appropriate for real-time applications while maintaining high detection accuracy. The GRAD framework achieved remarkable performance in anomaly detection and classification. The model demonstrated an overall F1-score of 97.6% for abnormal data and 99.4% for normal data, signifying its high accuracy in distinguishing between normal and anomalous sensor data. Regarding the anomaly classification, GRAD successfully categorized different anomaly types with high precision, enabling the recovery module to accurately restore damaged sensor data. Relative to analogous studies, GRAD surpasses current models by attaining a balance between elevated detection accuracy and diminished computational expense. These results demonstrate GRAD's potential as a reliable and efficient solution for real-time anomaly detection in autonomous vehicle systems, guaranteeing safe vehicle operation with minimal computational overhead.
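The abstract's two statistical ingredients, exponential smoothing with thresholding and windows of several lengths, can be sketched as follows. This is a plain exponential moving average with a fixed smoothing factor, not the paper's Reinforced EMA (which adapts its factors), and the function names, `alpha`, `k`, `warmup`, and the window sizes are all hypothetical choices for illustration.

```python
def ema_outliers(values, alpha=0.2, k=3.0, warmup=5):
    """Flag points that deviate more than k running standard deviations
    from an exponentially weighted mean.

    Flagged points are not folded into the running statistics, so one
    spike does not distort the baseline for later readings.
    """
    mean = values[0]
    var = 0.0
    flags = [False]
    for i, x in enumerate(values[1:], start=1):
        dev = x - mean
        std = var ** 0.5
        is_outlier = i >= warmup and std > 0.0 and abs(dev) > k * std
        flags.append(is_outlier)
        if not is_outlier:
            # Standard incremental updates for the exponentially
            # weighted mean and variance.
            mean += alpha * dev
            var = (1.0 - alpha) * (var + alpha * dev * dev)
    return flags


def trailing_means(values, sizes=(3, 6)):
    """Mean over trailing windows of several lengths (short- and long-term views)."""
    out = {}
    for n in sizes:
        means = []
        for i in range(len(values)):
            window = values[max(0, i - n + 1): i + 1]
            means.append(sum(window) / len(window))
        out[n] = means
    return out


if __name__ == "__main__":
    readings = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 25.0, 10.0, 9.9]
    print(ema_outliers(readings))
    print(trailing_means(readings))
```

In a pipeline like the one described, features of this kind would presumably be fed to the GRU classifier; how GRAD actually combines them is not specified in the abstract.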

data mining, machine learning, real time system, (17 more...)


2510.23327

Country: Asia > Middle East (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.93)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Architecture > Real Time Systems (1.00)


Predicting Road Crossing Behaviour using Pose Detection and Sequence Modelling

Dasgupta, Subhasis, Saha, Preetam, Roy, Agniva, Sen, Jaydip

arXiv.org Artificial Intelligence, Aug-22-2025

The world is rapidly advancing toward a future where artificial intelligence (AI) takes a central role in many everyday activities. In business, for example, robots have become indispensable in manufacturing processes and warehouse management. These robots efficiently handle tasks such as stacking and removing items, optimizing various business operations. In aviation, autopilot systems have been a standard feature in airplanes for many years, enhancing flight safety and efficiency. Similarly, in many developed countries, vehicles equipped with autopilot capabilities are becoming increasingly common. These self-driving vehicles are designed with an array of sensors and high-resolution cameras to monitor their surroundings, detect objects, and take necessary actions to prevent collisions or accidents. While these autonomous vehicles perform admirably on highways where the primary concern is other vehicles, they face significant challenges in busy urban environments. In such settings, it is often advisable for drivers to switch from autopilot to manual control. This is particularly crucial in bustling market areas where pedestrian behaviour can be unpredictable.

artificial intelligence, machine learning, video, (20 more...)


2508.15336

Country: Asia > India (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Air (0.54)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


MINIMALIST: switched-capacitor circuits for efficient in-memory computation of gated recurrent units

Billaudelle, Sebastian, Kriener, Laura, Moro, Filippo, Torchet, Tristan, Payvand, Melika

arXiv.org Artificial Intelligence, May-14-2025

Recurrent neural networks (RNNs) have been a long-standing candidate for processing of temporal sequence data, especially in memory-constrained systems that one may find in embedded edge computing environments. Recent advances in training paradigms have now inspired new generations of efficient RNNs. We introduce a streamlined and hardware-compatible architecture based on minimal gated recurrent units (GRUs), and an accompanying efficient mixed-signal hardware implementation of the model. The proposed design leverages switched-capacitor circuits not only for in-memory computation (IMC), but also for the gated state updates. The mixed-signal cores rely solely on commodity circuits consisting of metal capacitors, transmission gates, and a clocked comparator, thus greatly facilitating scaling and transfer to other technology nodes. We benchmark the performance of our architecture on time series data, introducing all constraints required for a direct mapping to the hardware system. The direct compatibility is verified in mixed-signal simulations, reproducing data recorded from the software-only network model.

artificial intelligence, deep learning, machine learning, (16 more...)


2505.08599

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)


Benchmarking Traditional Machine Learning and Deep Learning Models for Fault Detection in Power Transformers

Saravanan, Bhuvan, D, Pasanth Kumar M, Vengateson, Aarnesh

arXiv.org Artificial Intelligence, May-13-2025

Accurate diagnosis of power transformer faults is essential for ensuring the stability and safety of electrical power systems. This study presents a comparative analysis of conventional machine learning (ML) algorithms and deep learning (DL) algorithms for fault classification of power transformers. Using a condition-monitored dataset spanning 10 months, various gas concentration features were normalized and used to train five ML classifiers: Support Vector Machine (SVM), k-Nearest Neighbors (KNN), Random Forest (RF), XGBoost, and Artificial Neural Network (ANN). In addition, four DL models were evaluated: Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), One-Dimensional Convolutional Neural Network (1D-CNN), and TabNet. Experimental results show that both ML and DL approaches performed comparably. The RF model achieved the highest ML accuracy at 86.82%, while the 1D-CNN model attained a close 86.30%.

accuracy, artificial intelligence, machine learning, (15 more...)


2505.06295

Country: Asia > India (0.05)

Genre: Research Report > New Finding (0.48)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Improving Deep Knowledge Tracing via Gated Architectures and Adaptive Optimization

Shukurlu, Altun

arXiv.org Artificial Intelligence, Apr-30-2025

Deep Knowledge Tracing (DKT) models student learning behavior by using Recurrent Neural Networks (RNNs) to predict future performance based on historical interaction data. However, the original implementation relied on standard RNNs in the Lua-based Torch framework, which limited extensibility and reproducibility. In this work, we revisit the DKT model from two perspectives: architectural improvements and optimization efficiency. First, we enhance the model using gated recurrent architectures, specifically Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs), which better capture long-term dependencies and help mitigate vanishing gradient issues. Second, we re-implement DKT using the PyTorch framework, enabling a modular and accessible infrastructure compatible with modern deep learning workflows. We also benchmark several optimization algorithms (SGD, RMSProp, Adagrad, Adam, and AdamW) to evaluate their impact on convergence speed and predictive accuracy in educational modeling tasks. Experiments on the Synthetic-5 and Khan Academy datasets show that GRUs and LSTMs achieve higher accuracy and improved training stability compared to basic RNNs, while adaptive optimizers such as Adam and AdamW consistently outperform SGD in both early-stage learning and final model performance. Our open-source PyTorch implementation provides a reproducible and extensible foundation for future research in neural knowledge tracing and personalized learning systems.

artificial intelligence, knowledge tracing, machine learning, (17 more...)


2504.2007

Country: North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report > New Finding (0.46)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.89)
Education > Educational Setting (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Estimating Vehicle Speed on Roadways Using RNNs and Transformers: A Video-based Approach

Mareddy, Sai Krishna Reddy, Upplapati, Dhanush, Antharam, Dhanush Kumar

arXiv.org Artificial Intelligence, Feb-21-2025

This project explores the application of advanced machine learning models, specifically Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and Transformers, to the task of vehicle speed estimation using video data. Traditional methods of speed estimation, such as radar and manual systems, are often constrained by high costs, limited coverage, and potential disruptions. In contrast, leveraging existing surveillance infrastructure and cutting-edge neural network architectures presents a non-intrusive, scalable solution. Our approach utilizes LSTM and GRU to effectively manage long-term dependencies within the temporal sequence of video frames, while Transformers are employed to harness their self-attention mechanisms, enabling the processing of entire sequences in parallel and focusing on the most informative segments of the data. This study demonstrates that both LSTM and GRU outperform basic Recurrent Neural Networks (RNNs) due to their advanced gating mechanisms. Furthermore, increasing the sequence length of input data consistently improves model accuracy, highlighting the importance of contextual information in dynamic environments. Transformers, in particular, show exceptional adaptability and robustness across varied sequence lengths and complexities, making them highly suitable for real-time applications in diverse traffic conditions. The findings suggest that integrating these sophisticated neural network models can significantly enhance the accuracy and reliability of automated speed detection systems, thus promising to revolutionize traffic management and road safety.
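The contrast drawn above between recurrent models and self-attention can be illustrated with a minimal sketch of scaled dot-product self-attention in plain Python. As a simplifying assumption, queries, keys, and values are the input vectors themselves, with no learned projections or multiple heads, so this is not the project's model; the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with Q = K = V = seq.

    seq is a list of equal-length feature vectors, one per time step
    (for example, one per video frame).
    """
    d = len(seq[0])
    out = []
    for q in seq:
        # Each position scores every position, including itself.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        # Output is the attention-weighted average of all positions.
        out.append([sum(w * v[j] for w, v in zip(weights, seq)) for j in range(d)])
    return out


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in self_attention(frames):
        print(row)
```

Each output row depends only on the full input sequence and not on the previous output row, which is why the rows can be computed in parallel. A GRU or LSTM, by contrast, must process time steps in order because each state depends on the one before it.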

speed estimation, transformer, vehicle speed estimation, (13 more...)


2502.15545

Country:

North America > United States > North Carolina (0.05)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
Europe > Finland > Pirkanmaa > Tampere (0.04)

Genre: Research Report (0.84)

Industry:

Education > Educational Setting (0.48)
Transportation > Ground > Road (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Integrative Analysis of Financial Market Sentiment Using CNN and GRU for Risk Prediction and Alert Systems

Wu, You, Sun, Mengfang, Zheng, Hongye, Hu, Jinxin, Liang, Yingbin, Lin, Zhenghao

arXiv.org Artificial Intelligence, Dec-13-2024

This document presents an in-depth examination of stock market sentiment through the integration of Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU), enabling precise risk alerts. The robust feature extraction capability of CNN is utilized to preprocess and analyze extensive network text data, identifying local features and patterns. The extracted feature sequences are then input into the GRU model to understand the progression of emotional states over time and their potential impact on future market sentiment and risk. This approach addresses the order dependence and long-term dependencies inherent in time series data, resulting in a detailed analysis of stock market sentiment and effective early warnings of future risks.

gru, neural network, prediction, (14 more...)


2412.10199

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Asia > China > Hong Kong (0.04)
North America > United States > Arizona (0.04)

Genre: Research Report (0.82)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Reviews: Preventing Gradient Explosions in Gated Recurrent Units

Neural Information Processing Systems, Oct-8-2024, 12:50:23 GMT

Summary: The authors propose a method for optimizing GRU networks which aims to prevent exploding gradients. They motivate the method by showing that a constraint on the spectral norm of the state-to-state matrix keeps the dynamics of the network stable near the fixed point 0. The method is evaluated on language modelling and a music prediction task and leads to stable training in comparison to weight clipping.

Technical quality: The motivation of the method is well developed and it is nice that the method is evaluated on two different real-world datasets. However, one important issue I have with the evaluation is that the learning rate is not controlled for in the experiments. Unfortunately, this makes it hard to draw strong conclusions from the results.
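As an illustration of what a spectral norm constraint on a state-to-state matrix involves, here is a minimal sketch in plain Python. It estimates the largest singular value by power iteration and rescales the whole matrix when the estimate exceeds a bound. Both the rescaling strategy and the bound `max_norm` are assumptions for this sketch; the paper's actual projection step and threshold may differ, and the function names are hypothetical.

```python
import math


def spectral_norm(m, iters=100):
    """Estimate the largest singular value of m by power iteration on m^T m.

    The uniform starting vector can fail in the rare case where it is
    orthogonal to the top right singular vector.
    """
    rows, cols = len(m), len(m[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        u = [sum(a * b for a, b in zip(row, v)) for row in m]            # u = m v
        w = [sum(m[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # w = m^T u
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        v = [x / norm_w for x in w]
    u = [sum(a * b for a, b in zip(row, v)) for row in m]
    return math.sqrt(sum(x * x for x in u))


def clip_spectral_norm(m, max_norm):
    """Return a copy of m, rescaled so its spectral norm is at most max_norm."""
    sigma = spectral_norm(m)
    if sigma <= max_norm:
        return [row[:] for row in m]
    scale = max_norm / sigma
    return [[x * scale for x in row] for row in m]


if __name__ == "__main__":
    u_h = [[3.0, 0.0], [0.0, 1.0]]
    print(spectral_norm(u_h))
    print(clip_spectral_norm(u_h, max_norm=2.0))
```

A step like this would typically be applied to the recurrent weight matrix after each gradient update, which differs from gradient clipping in that it constrains the parameters themselves instead of the size of each update.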

gated recurrent unit, grus, preventing gradient explosion, (5 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (0.65)
